Title of the Invention

Efficient Compression Using Differential Caching

Background of the Invention

1. Field of the Invention

This invention relates to reducing the bandwidth and computing resources
required to transmit a web page over a network.
2. Related Art

It is desirable to transmit web pages in such a way as to minimize the
bandwidth and other computing resources required to transmit the web page from a
server device to a client device.

One popular technique for minimizing bandwidth requirements and computing
resources is to store the web pages at multiple mirroring servers throughout the
network. By "pushing content to the edge of a network" (that is, caching the content and
serving it from a mirroring server), it is possible to minimize the distance (that is,
geographical distance or distance as measured by network topology) that information must
travel before reaching a destination client device. Although redistribution of the load
among multiple mirroring servers improves the quality of service by minimizing distance,
it does not necessarily affect the bit rate at which a web page is transmitted.

Another technique for minimizing bandwidth requirements involves compression
of the web page. Compression generally involves the use of a computer program such as
gzip, glib or some other similar program. Since HTML is very compressible (for example,
the use of glib can result in 73.8% compression of HTML), there is a significant
reduction in the amount of bandwidth required for transmission of web pages. One
drawback to this technique, however, is that compressing each web page is a relatively
inefficient use of computing resources. Every time a web page is sent to a user, the page
must be compressed, regardless of whether the page was compressed in the past. For
example, every time an organizational home page is sent from a server to a client device,
computing resources are devoted to compression of that home page. In a second example,
every time a server sends similar pages (such as two different instances of an order form
from an on-line retailer), those pages must be separately compressed without regard for
their similarity.
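
By way of illustration only, the drawback described above can be sketched in a few
lines. The sketch assumes Python's standard zlib module as a stand-in for the compression
program and uses a made-up page; the saving it prints depends on the input and is not the
figure quoted above.

```python
import zlib

# Hypothetical page: repetitive markup, as HTML often is.
page = ("<html><body><table>"
        + "".join(f"<tr><td>Item {i}</td><td>Price</td></tr>" for i in range(200))
        + "</table></body></html>").encode("utf-8")


def serve_naively(body: bytes, requests: int) -> tuple[bytes, int]:
    """Compress the same body once per request, as in the prior art."""
    compressions = 0
    compressed = b""
    for _ in range(requests):
        compressed = zlib.compress(body)  # identical work repeated every time
        compressions += 1
    return compressed, compressions


compressed_page, work = serve_naively(page, requests=5)
saving = 1 - len(compressed_page) / len(page)
print(f"saved {saving:.0%} of the bytes, using {work} compression passes")
```

The bandwidth saving is obtained, but the number of compression passes grows with the
number of requests even though the input never changes.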

Accordingly, it would be desirable to provide a technique for serving rela- 
tively non-static content for delivery in a content delivery network. 

Summary of the Invention 

In a first aspect of the invention, a template for a web page or for a set of
web pages is identified and compressed at an originating server. A template can include
either (1) an entire web page or (2) those elements in a web page that are relatively
unchanging. Examples of templates include:

• Entire web pages
• Order form pages without the personalized information
• Stock update pages without the ticker information
• Product specs without the product descriptors
• Ticket sale pages, without information about a particular event
• Portions of comparable web pages wherein a portion of the web page does not
  change frequently.

A library of these compressed templates is stored at the originating server and the
mirroring servers so as to make it available for subsequent requests. Generally,
compression of the template occurs only once. The template does not need to be compressed
again unless changes to the template itself are made (for example, changes to the style
of the template). In this respect, the invention is very different from the prior art of
HTML compression, which requires compression of a web page every time the page is sent.
By minimizing the number of times that compression takes place, fewer computational
resources need be allocated toward compression operations. In a preferred embodiment,
the compressed template is cached and served from a remote mirroring server. In other
embodiments, the compressed template is cached and served from the originating server.
When a client device requests a web page, the compressed template is sent to the client
device either from the mirroring server (if the template is present there) or from the
originating server by way of the mirroring server (if the template is not present at the
mirroring server).
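
As a minimal sketch of the compress-once behavior described above, the following
assumes Python's zlib module as the compression program. The class name TemplateLibrary,
its get method and the "order-form" key are hypothetical and are introduced only for
illustration.

```python
import zlib


class TemplateLibrary:
    """Caches compressed templates so each version is compressed only once."""

    def __init__(self) -> None:
        # key -> (uncompressed source, compressed bytes)
        self._cache: dict[str, tuple[bytes, bytes]] = {}
        self.compressions = 0

    def get(self, key: str, template: bytes) -> bytes:
        entry = self._cache.get(key)
        if entry is None or entry[0] != template:
            # First request, or the template itself changed: compress and store.
            self.compressions += 1
            entry = (template, zlib.compress(template))
            self._cache[key] = entry
        return entry[1]


library = TemplateLibrary()
form = b"<html><body><form>...order form markup...</form></body></html>"
for _ in range(3):
    library.get("order-form", form)          # compressed on the first call only
restyled = form.replace(b"<body>", b"<body class='new'>")
library.get("order-form", restyled)          # a style change forces one more pass
```

Four requests here cost two compression passes: one for the original template and one
for the restyled version.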

In a second aspect of the invention, delta information from a set of web
pages is identified and compressed at the originating server. Unlike templates, which
include substantial similarities, delta information refers to a set of differences, such
as differences between instances of a particular web page. For example, in a web page of
a completed order form, the web page may be divided into (1) a template, including the
unchanging features of the page (or the entire page if there is no dynamic information in
the page); and (2) delta information comprising specific personalized information
pertaining to an individual order. In a preferred embodiment, the delta information is
compressed at the originating server and sent from the originating server to the client.
In a preferred embodiment, delta information is compressed each time it is transmitted.
Given that the delta information is usually relatively small, the amount of computational
resources devoted to its compression is also relatively small. This selective compression
minimizes the bandwidth needed to transmit the page while simultaneously minimizing the
computing resources allocated for compression. In other embodiments, the delta
information may be compressed once and cached until such time that the compressed delta
information is reused. This may be useful when the delta information includes advertising
copy that may be repeated.
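
The division of an order form into a template and delta information might be sketched
as follows. This is an illustration under stated assumptions, not the encoding used by
the invention: the slot syntax ({customer} and so on), the JSON representation of the
delta and the use of Python's zlib module are all assumptions made for the example.

```python
import json
import zlib

# Hypothetical template: the unchanging markup, with named slots for the delta.
TEMPLATE = ("<html><body><h1>Order confirmation</h1>"
            "<p>Customer: {customer}</p><p>Item: {item}</p>"
            "<p>Total: {total}</p></body></html>")

# The template is compressed once and reused for every order.
compressed_template = zlib.compress(TEMPLATE.encode("utf-8"))


def compress_delta(delta: dict[str, str]) -> bytes:
    """Compress only the per-order values; this runs on every request."""
    return zlib.compress(json.dumps(delta, sort_keys=True).encode("utf-8"))


order = {"customer": "A. Example", "item": "Widget", "total": "$12.00"}
compressed_delta = compress_delta(order)
full_page = TEMPLATE.format(**order).encode("utf-8")
print(len(full_page), "bytes in the page;", len(json.dumps(order)), "bytes of delta")
```

Only the small delta passes through the compressor on each request, which is the source
of the saving in computing resources described above.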

Incorporated Disclosures

The invention described herein can be used in conjunction with inventions
described in the following applications:

• Application Serial No. 09/888,374, filed June 22, 2001, in the name of Stephane
  Kasriel, titled "Content Delivery Network using Differential Caching", attorney docket
  number 155.1005.01.

Express mailing EL734815879US

155.1006.01

This application is hereby incorporated by reference as if fully set forth
herein. It is referred to herein as the "incorporated disclosures".

Brief Description of the Drawings

Figure 1 shows a block diagram of a system for efficient compression using
differential caching.

Figure 2 shows a flow diagram of a method for efficient compression using
differential caching.

Figure 3 shows a data flow diagram in a system for efficient compression
using differential caching.

Figure 4 shows a data flow diagram for efficient compression using
differential caching using a proxy encoder server.

Description of the Preferred Embodiment

In the following description, a preferred embodiment of the invention is
described with regard to preferred process steps and data structures. Those skilled in
the art would recognize after perusal of this application that embodiments of the
invention can be implemented using one or more general purpose processors or special
purpose processors or other circuits adapted to particular process steps and data
structures described herein, and that implementation of the process steps and data
structures described herein would not require undue experimentation or further invention.

Lexicography

• Originating server - as used herein, an "originating server" takes on the role of a
  server in a client-server relationship and is the original provider of content to a
  client device or to a mirroring server.

• Mirroring server - as used herein, a "mirroring server" includes any device that
  takes on the role of a server in a client-server relationship and that receives
  requests from client devices and responds to those requests by sending content that
  originated (in whole or in part) at an originating server.

• Template - as used herein, the term "template" refers to a selected portion of a web
  page that is relatively unchanging. If there is no difference between different
  instances of a web page, then the entire page may be a template.

• Delta information - as used herein, the term "delta information" refers to a selected
  portion of a web page that varies between instances of the web page.
System Elements

Figure 1 shows a block diagram of a system for efficient compression using
differential caching.

A system for efficient compression using differential caching (shown by
general character reference 100) includes one or more client devices 110 under the
control of user 112, an originating server 120, a set of mirroring servers 130 and a
communication network 140. A proxy encoder 150 may be positioned "in front" of the
originating server 120.

Each client device 110 includes a processor, an input element, a presentation
element, a local memory and system software. Client devices 110 further include
software 114 disposed for communicating with the communication network 140 and
software 116 for integrating a web page.

In a preferred embodiment, each client device 110 includes a general-purpose
computer, such as a laptop or workstation. However, a client device 110 can
also include (either alone or in conjunction with a laptop or workstation) a hand-held
calendar (such as a "Palm Pilot" or other hand-held device), a portable computer, a
special purpose computer, a cellular telephone or other telephonic device, a web server
acting as the agent for a user, or another device. In alternative embodiments, a client
device 110 may also include any other device disposed for performing all or some of the
functions described herein.

The software 114 for integrating a web page includes elements for
decompressing a template, decompressing delta information, and integrating the template
and delta information into a unified presentation for the user 112.

In some embodiments, the client device 110 also includes a decoder 118,
preferably as a browser "plug-in". However, the decoder 118 may also be situated in
other locations downstream from the originating server 120, such as at a cache or
firewall associated with an ISP or at the edge of an enterprise network. The decoder 118
specifies that it accepts delta encoding by adding information to the HTTP header. In
such embodiments, the decoder 118 may also perform the functions performed by software
114, such as integrating information.
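
The text above does not specify which information the decoder adds to the HTTP header.
Purely as a sketch, the following assumes a hypothetical header named X-Accept-Delta
with the value "1"; the header name, its value and both function names are invented for
illustration.

```python
# Hypothetical header name; the description does not specify one.
DELTA_HEADER = "X-Accept-Delta"


def decoder_request_headers(base: dict[str, str]) -> dict[str, str]:
    """Return a copy of the outgoing request headers with the delta marker added."""
    headers = dict(base)
    headers[DELTA_HEADER] = "1"
    return headers


def client_accepts_delta(headers: dict[str, str]) -> bool:
    """Server-side check. HTTP header names are compared case-insensitively."""
    return any(name.lower() == DELTA_HEADER.lower() and value == "1"
               for name, value in headers.items())


with_decoder = decoder_request_headers({"Host": "www.example.com"})
without_decoder = {"Host": "www.example.com"}
print(client_accepts_delta(with_decoder), client_accepts_delta(without_decoder))
```

A server using such a check could serve template and delta information to the first
request and fall back to another form of response for the second.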

Those embodiments of the invention that do not include a decoder 118 are
known as "clientless". In such instances, the server sees that the request does not come
from a decoder 118 and serves deltas in the form of javascript instructions and a
reference to the template. The scripting capabilities of the client's browser are
directed toward applying the delta to the template and displaying the HTML page. In other
embodiments, other scripting techniques may also be used.
The originating server 120 is under the control of an entity that provides
web pages for users 112. Similar to the client devices 110, the originating server 120
includes a processor, an input element, a presentation element, system software and a
local memory. However, unlike the client device 110, the originating server 120 includes
a cache of the complete set of compressed templates 122, a cache of delta information
124, compression software 126 (for example gzip, glib or another comparable program) and
web server software 128. The compression software 126 is used to compress both template
information and delta information.

The compressed templates included in the cache of the complete set of
compressed templates 122 are derived from web pages that include relatively unchanging
elements, such as the backdrop in weather pages, blank charts used in stock pages,
incomplete order form pages and similar web pages with relatively unchanging elements.
If a web page does not include dynamic elements, the compressed template may include
the entire web page. In a preferred embodiment, the template is compressed and cached
with the complete set of compressed templates 122. It does not have to be compressed
again unless the information itself is updated (for example, with stylistic changes).

The delta information in the cache of delta information 124 includes those
portions of a set of web pages that are highly variable. This delta information can be
derived from web pages that include elements that change frequently, such as a weather
forecast, the value of a stock, information to complete an order form and comparable
information that is either very ephemeral or unique to a particular user 112. If the
total number of sets of delta information 124 associated with a particular URL is
relatively low (for example, a few different rotating banner advertisements) and that
information is relatively small, the server can maintain a table mapping uncompressed
deltas to compressed deltas. This saves computing resources because there is no need to
re-compress the delta information every time it is served. Optimal benefits are obtained
if the delta information is requested many times before the template changes.
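
A table mapping uncompressed deltas to compressed deltas might look like the following
sketch. It assumes Python's zlib module for compression and keys the table by a SHA-256
digest of the uncompressed delta; the class name DeltaTable and the digest-keyed layout
are assumptions made for illustration.

```python
import hashlib
import zlib


class DeltaTable:
    """Maps each uncompressed delta to its compressed form, keyed by digest."""

    def __init__(self) -> None:
        self._table: dict[bytes, bytes] = {}
        self.compressions = 0

    def compressed(self, delta: bytes) -> bytes:
        key = hashlib.sha256(delta).digest()
        if key not in self._table:
            # Seen for the first time: compress once and remember the result.
            self.compressions += 1
            self._table[key] = zlib.compress(delta)
        return self._table[key]


# Three hypothetical rotating banner advertisements for one URL.
banners = [b"<img src='/ads/a.png'>", b"<img src='/ads/b.png'>", b"<img src='/ads/c.png'>"]
table = DeltaTable()
for i in range(12):                       # twelve requests, three distinct deltas
    table.compressed(banners[i % 3])
print(table.compressions, "compression passes for 12 requests")
```

The table pays off only when the set of distinct deltas is small, as in the banner
example; for deltas unique to each user it would grow without being reused.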

The set of mirroring servers 130 is usually under the control of the same
entity that controls the originating server 120. Similar to the originating server 120,
the set of mirroring servers 130 includes a processor, system software, an input element,
a presentation element, a local memory, a cache of compressed web page templates 132,
compression software 134, and web server software 136. However, unlike the originating
server 120, the cache of compressed web page templates 132 at the mirroring server 130
is not necessarily complete. Moreover, unlike the compression software 126, which
compresses both templates and delta information, the compression software 134 at the
mirroring server 130 is generally used to compress delta information rather than
templates. In a preferred embodiment, the mirroring servers are relatively more local to
the client devices 110 than the originating server 120.
As noted supra, some embodiments of the system 100 also include a proxy
encoder 150. In these embodiments, the proxy encoder 150 is coupled to the originating
server 120, either as a separate server or as an encoder coupled to the originating
server. The proxy encoder server 150 compares information stored locally or at the
client to possibly fresher information from the mirroring server 130 or the originating
server 120 and serves the compressed template and delta information to the client 110.

The communication network 140 is disposed for transporting compressed
templates, delta information and requests for web pages between the client devices 110,
the originating server 120 and the mirroring server 130. In a preferred embodiment, the
communication network 140 includes a packet switched network such as the Internet, as
well as (in conjunction with or instead of) an intranet, an enterprise network, an
extranet, a virtual private network, a virtual switched network, or a wireless network.
In alternative embodiments, the communication network 140 may include any other set of
communication links that couple the client devices 110, the originating server 120 and
the mirroring servers 130.

Method of Operation

Figure 2 shows a flow diagram of a method for efficient compression using
differential caching.

A method 200 for efficient compression using differential caching is
performed by the system 100, including a set of client devices 110, an originating server
120, a set of mirroring servers 130 and a communication network 140.
Although described serially and in a particular sequence, in a preferred
embodiment the steps described herein can be performed concurrently or in parallel by the
system elements, or could be performed in a different sequence or some combination
thereof.



At a flow point 205, a user 112 is ready to request a web page from a
mirroring server 130. In a preferred embodiment, the mirroring server 130 is relatively
more proximate to the user 112 than the originating server 120.

At a step 210, the user 112 causes the client device 110 to generate a re- 



in other embodiments, the client device 110 may request the web page from 



When the request is made, the client device 110 indicates whether it can
receive delta information. If the client device 110 can receive delta information, the
method proceeds with step 215.
At a step 215, the mirroring server 130 determines if a compressed template
for the web page is available in the cache of compressed templates 132. If necessary, the
mirroring server 130 obtains the compressed template from the originating server 120. If
the compressed template is unavailable from the originating server 120 (this may be the
case if the template has not been requested before, or if the template has been recently
changed), a template is created and compression software 126 at the originating server
120 compresses the template. The newly compressed template is sent from the originating
server 120 to the mirroring server 130, where it is stored in the cache of compressed
templates 132.
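
The lookup in step 215 can be sketched as a two-level cache. The class and method names
below are hypothetical, Python's zlib module stands in for the compression software 126,
and the network transfer between the two servers is reduced to a method call.

```python
import zlib


class OriginatingServer:
    def __init__(self, pages: dict[str, bytes]) -> None:
        self._pages = pages                        # uncompressed template source
        self._compressed: dict[str, bytes] = {}    # complete set of compressed templates

    def compressed_template(self, url: str) -> bytes:
        if url not in self._compressed:
            # Not requested before: create and compress the template now.
            self._compressed[url] = zlib.compress(self._pages[url])
        return self._compressed[url]


class MirroringServer:
    def __init__(self, origin: OriginatingServer) -> None:
        self._origin = origin
        self._cache: dict[str, bytes] = {}         # possibly incomplete cache
        self.origin_fetches = 0

    def compressed_template(self, url: str) -> bytes:
        if url not in self._cache:
            # Cache miss: obtain the compressed template from the origin and keep it.
            self.origin_fetches += 1
            self._cache[url] = self._origin.compressed_template(url)
        return self._cache[url]


origin = OriginatingServer({"/weather": b"<html>...weather backdrop...</html>"})
mirror = MirroringServer(origin)
first = mirror.compressed_template("/weather")     # fetched from the origin
second = mirror.compressed_template("/weather")    # served from the mirror's cache
```

This sketch omits invalidation when a template changes; a fuller version would need some
way for the mirror to learn that its cached copy is stale.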

At a step 220, the mirroring server 130 transmits the compressed template
to the client device 110.

If the mirroring server 130 was bypassed in step 210, the compressed
template is sent to the client device 110 from the originating server 120.

At a step 225, the client device 110 or the mirroring server 130 (depending
upon the configuration of the system 100) transmits a message to the originating server
120 for the delta information.

At a step 230, the originating server 120 identifies the delta information in
the cache of delta information 124 and compresses it using the compression software
126. Depending on the size and type of delta information 124, this compressed
delta information may be stored at the originating server 120 or the mirroring server
130. This is particularly useful if there is a strong likelihood that the delta
information will be requested again.

At a step 235, the originating server 120 transmits the compressed delta
information to the client device 110.

At a step 240, the client device 110 integrates the template and delta by
performing the following substeps:

• At a substep 240(a), the template is decompressed.
• At a substep 240(b), the delta information is decompressed.
• At a substep 240(c), the decompressed template and decompressed delta are
  integrated so as to form a complete web page that is presented to the user 112.
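
The three substeps of step 240 might be sketched as follows. The sketch assumes, for
illustration only, that the template marks its variable portions with named slots such
as {city}, that the delta is a JSON object of slot values, and that both were compressed
with zlib; none of these encodings is specified by the description.

```python
import json
import zlib


def integrate(compressed_template: bytes, compressed_delta: bytes) -> str:
    """Rebuild the complete page on the client from its two compressed parts."""
    template = zlib.decompress(compressed_template).decode("utf-8")   # substep 240(a)
    delta = json.loads(zlib.decompress(compressed_delta))             # substep 240(b)
    return template.format(**delta)                                   # substep 240(c)


# Hypothetical inputs, as they might arrive from the servers.
template_bytes = zlib.compress(b"<p>Forecast for {city}: {forecast}</p>")
delta_bytes = zlib.compress(json.dumps({"city": "Oslo", "forecast": "snow"}).encode("utf-8"))
page = integrate(template_bytes, delta_bytes)
print(page)
```

All of this work runs on the client device, so it adds nothing to the load on the
servers.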

Given that the software 114 for integrating a web page is present at the client device
110, the resources required to decompress the delta information and template and
integrate them do not contribute to the total resources required by the servers to
compress and serve the web page.
In other embodiments, the system 100 does not include a decoder 118.
These embodiments are known as "clientless", because many of the functions normally
performed by the decoder 118 are performed differently. In such embodiments, the
originating server 120 identifies that the client can receive a delta (e.g. via a
cookie), and serves a delta instead of the document. The delta contains a reference to
the template, which can be served from either the mirroring servers 130 or from the
originating server 120. In such embodiments, the template is compressed once and cached
at the mirroring server 130.

Figure 3 shows a data flow diagram in a system for efficient compression
using differential caching.

The system 300 includes a set of data flows for sending and receiving
information between the client devices 110, the originating server 120 and the mirroring
server 130, using the communication link 140. It should be noted that it is not necessary
to exhaust every data flow to achieve efficient differential caching.



A data flow 310 includes message sets between the client device 110
under the control of the user 112 and the mirroring server 130.

Message set A from the client device 110 to the mirroring server 130
includes requests for web pages.

Message set B from the mirroring server 130 to the client device 110
includes the following:

• compressed templates
• compressed delta information.

A data flow 320 includes message sets between the mirroring server 130 and
the originating server 120.

Message set C from the mirroring server 130 to the originating server 120
includes the following:

• requests for compressed templates that are not available in the cache of
  compressed templates 132
• requests for delta information if the delta information is not available at the
  mirroring server 130.

Message set D from the originating server 120 to the mirroring server 130
includes the following:

• compressed templates
• compressed delta information.




A data flow 330 includes message sets between the originating server 120
and the client device 110.

Message set E from the client device 110 to the originating server 120
includes the following:

• requests for compressed templates
• requests for compressed delta information.

Message set F from the originating server 120 to the client device 110 includes
the following:

• compressed templates
• compressed delta information.

Figure 4 shows a data flow diagram for efficient compression using
differential caching using a proxy encoder server.

A method 400 for efficient compression using differential caching is
performed by the system 100, including a set of client devices 110, an originating server
120, a set of mirroring servers 130, a proxy encoder 150 and a communication network
140.

Similar to method 200, the steps described herein can be performed
concurrently or in parallel by the system elements, or could be performed in a different
sequence or some combination thereof.

At a flow point 405, a user 112 is ready to request a web page from a
mirroring server 130. In a preferred embodiment, the mirroring server 130 is relatively
more proximate to the user 112 than the originating server 120.

In a step 410, the user 112 requests the web page. In a preferred
embodiment, this request is made using the decoder 118. The decoder 118 intercepts the
request and redirects it to the proxy encoder server 150. The decoder 118 also informs
the proxy encoder server 150 that the client 110 can receive delta and template
information, and can provide delta encoding.

In other embodiments, the request can be made without using the decoder
118. In these other embodiments, the request goes directly to the proxy encoder server
150. The absence of a decoder 118 indicates that the client cannot provide delta
encoding.

In a step 415, the proxy encoder server 150 obtains the web page or a
template corresponding to the web page from the mirroring server or originating server.
The proxy encoder server 150 compares the web page or template with information in
the cache of compressed templates 133 (as noted supra, in such embodiments, these
caches are local to the proxy encoder server 150). If there is not a corresponding
template in the cache of compressed templates 126 (as may be the case if the web page was
never requested before), then a new template is generated, compressed and cached. If
there is a corresponding template in the cache of compressed templates 126 and that
corresponding template needs to be updated, then updating is performed at this time and
the updated template is compressed and cached. The compressed template may be tagged
with information specifying a version number associated with the template.

In those embodiments that do not provide delta encoding, the compressed
template may be stored at the client 110. In such embodiments, the proxy encoder server
150 calculates the difference between the compressed template at the client 110 and the
fresh page; this difference is the delta. The delta is an HTML page that includes a
reference to a compressed template (preferably a Javascript) and some Javascript
instructions. In this embodiment, the Javascript instructions tell the client 110 how to
transform the template into the updated web page.
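
A delta of this kind might be generated as in the sketch below. The instruction format is
not specified by the description, so the sketch assumes, hypothetically, that the script
at the referenced URL defines a global string named TEMPLATE containing {slot} markers and
that the delta is a set of slot values. The function name and URL are invented for the
example.

```python
import json


def clientless_delta_page(template_url: str, replacements: dict[str, str]) -> str:
    """Build the small HTML page served in place of the full document.

    Assumes the script at template_url defines a global string TEMPLATE
    containing {slot} markers (a hypothetical convention).
    """
    ops = json.dumps(replacements)       # JSON doubles as a JavaScript object literal
    ops = ops.replace("</", "<\\/")      # keep "</script>" in a value from ending the script
    return (
        "<html><head>"
        f'<script src="{template_url}"></script>'
        "<script>"
        f"var delta = {ops};"
        "var page = TEMPLATE;"
        "for (var slot in delta) {"
        " page = page.split('{' + slot + '}').join(delta[slot]);"
        "}"
        "document.write(page);"
        "</script></head><body></body></html>"
    )


html = clientless_delta_page("/templates/order-v3.js", {"customer": "A. Example"})
print(html)
```

Because the template is fetched by URL, the browser's ordinary cache can hold it between
requests, and only the short delta page changes from one request to the next.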

In a step 420, the proxy encoder server 150 sends the compressed template
to the client. This compressed template may be stored at the client 110.

The following steps occur when the user 112 requests the same web page
associated with the compressed and cached template:
In a step 425, the user 112 requests a web page using the decoder 118. In
making the request, the decoder 118 also specifies what version of a compressed template
for the web page has been received in the past. This request is directed to the proxy
encoder server 150.

In a step 430, the proxy encoder server 150 receives the request and
identifies a compressed template that is responsive to the request. The proxy encoder
server 150 calculates the difference between the version of the compressed template and a
presumably fresher version of the web page that is obtained from the originating server
120 or the mirroring server 130. This difference is the delta information.
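
The difference calculation in step 430 is not tied to a particular algorithm. As one
possible sketch, the following uses SequenceMatcher from Python's standard difflib module
to express the delta as a list of replacement edits; the edit format, the version table
and the sample pages are assumptions made for illustration.

```python
import difflib


def compute_delta(old: str, new: str) -> list[tuple[int, int, str]]:
    """Return edits (start, end, replacement) that turn old into new."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    return [(i1, i2, new[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes() if tag != "equal"]


def apply_delta(old: str, delta: list[tuple[int, int, str]]) -> str:
    """Apply the edits from right to left so that earlier offsets stay valid."""
    result = old
    for start, end, replacement in reversed(delta):
        result = result[:start] + replacement + result[end:]
    return result


# Hypothetical version table at the proxy encoder: version number -> template text.
versions = {3: "<p>ACME: 101.25</p><p>Updated 09:30</p>"}
fresh = "<p>ACME: 102.10</p><p>Updated 09:45</p>"
client_version = 3                          # as reported by the decoder in step 425
delta = compute_delta(versions[client_version], fresh)
rebuilt = apply_delta(versions[client_version], delta)
```

The client applies the same edits to its stored copy of version 3 to obtain the fresh
page, so only the changed characters need to cross the network.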

In a step 435, the proxy encoder sends the delta information that is responsive
to the request. In some embodiments, the delta information is compressed before
sending. In other embodiments, the delta information is compressed and cached.

In the "clientless" version, the proxy encoder server 150 does not know
what version of the template is at the client. The proxy encoder 150 therefore decides
which version to use and instructs the client to use that specific version of the
template. This is comparable to steps 310-335.
Alternative Embodiments

Although preferred embodiments are disclosed herein, many variations are
possible which remain within the concept, scope and spirit of the invention; these
variations would be clear to those skilled in the art after perusal of this application.
One such variation includes caching (either at the originating server, mirroring server
or proxy encoder server) a Huffman tree corresponding to a web page that has been
requested in the past. In such alternative embodiments, the delta is calculated by
comparing the Huffman tree to newer versions of the tree and computing a delta based upon
those parts of the tree that have changed. This technique is preferable for web pages
that are requested very frequently (or that change very rarely). In other embodiments, a
Huffman tree corresponding to a template is generated and served separate from the delta
information.
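
How the trees are compared is not detailed above. One simplified reading, offered only as
a sketch, is to flatten each Huffman tree into its code table (byte value to bit string)
and to treat the entries that differ between the cached table and the newer table as the
changed parts. The function names and the table-based comparison are assumptions.

```python
import heapq
from collections import Counter
from typing import Optional


def huffman_codes(data: bytes) -> dict[int, str]:
    """Build a Huffman code table (byte value -> bit string) for data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, smallest symbol in subtree, {symbol: code so far}).
    # The second field is unique per entry, so the dicts are never compared.
    heap = [(weight, sym, {sym: ""}) for sym, weight in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, t1, left = heapq.heappop(heap)
        w2, t2, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, min(t1, t2), merged))
    return heap[0][2]


def table_delta(old: dict[int, str], new: dict[int, str]) -> dict[int, Optional[str]]:
    """Entries of the code table that changed; None marks a removed symbol."""
    changed: dict[int, Optional[str]] = {s: c for s, c in new.items() if old.get(s) != c}
    changed.update({s: None for s in old if s not in new})
    return changed


cached_table = huffman_codes(b"<p>ACME: 101.25</p>")
newer_table = huffman_codes(b"<p>ACME: 102.10</p>")
changes = table_delta(cached_table, newer_table)
```

When a page changes rarely, the two tables are mostly identical and the set of changed
entries stays small, which is consistent with the preference stated above.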






